ABSTRACT

Weighted correlation functions are an increasingly important tool for understanding 
how galaxy properties depend on their separation from each other. We use a mock 
galaxy sample drawn from the Millennium simulation, assigning weights using a simple
prescription to illustrate and explore how well a weighted correlation function recovers 
the true radial dependence of the input weights. We find that the use of a weighted 
correlation function results in a dilution of the magnitude of any radial dependence of 
properties and a smearing out of that radial dependence in radius, compared to the 
input behavior. We present a quantitative discussion of the dilution in the magnitude 
of radial dependence in properties in the special case of a constant enhancement at 
$r < r_c$. In this particular case, where there is an SFR enhancement at small radius
$r < r_c = 35$ kpc, the matching of one member of an enhanced pair with an unenhanced
galaxy in the same group gives an artificial enhancement out to large radius, ~200 kpc.
We compare this with observations of SFR enhancement from the SDSS (Li et al.
2008; MNRAS, 385, 1903), finding very similar behavior: a significant enhancement
at radii < 40 kpc and a weak enhancement out to more than 150 kpc. While we explore
a particular case in this Letter, it is easy to see that the phenomenon is general, and 
precision analyses of weighted correlation functions will need to account carefully for 
this effect using simulated mock catalogs. 
1 INTRODUCTION

Correlation functions (2-point and higher order ones) have
proved to be powerful statistical tools for the study of
galaxy clustering (e.g. Peebles & Groth 1975; Groth & Peebles 1977;
Peebles 1980; Davis & Peebles 1983) and are still widely used in
both the local (Connolly et al. 2002; Eisenstein et al. 2005;
Masjedi et al. 2006) and high redshift Universe (Giavalisco et al.
1998; Blain et al. 2004).

Studies of the two point correlation function have matured
to the point that one can study how galaxies populate dark
matter halos in detail (e.g., Zehavi et al. 2004), the typical
halo masses of galaxy populations as a function of redshift
(e.g., Lyman breaks; Giavalisco et al. 1998), the relative
clustering of different populations (e.g., the tendency
of AGN to cluster like the massive galaxy population as a
whole; Li et al. 2006), and the use of clustering measures on
the smallest scale to constrain the merger history of galaxies
(e.g., Patton et al. 2002; Bell et al. 2006; Robaina et al.
2010).

Furthermore, the correlation function method allows
us to study not only the clustering of the galaxies themselves,
but also how some of their properties are clustered.
Weighted correlation functions (Boerner et al. 1989) or, in
a general sense, marked statistics (Beisbart & Kerscher
2000; Gottlöber et al. 2002; Faltenbacher et al. 2002;
Skibba et al. 2006; Robaina et al. 2009) have been widely
used in the last ten years to study how observables
depend on the separation between galaxies. In particular,
weighted correlation functions are frequently used to study
the dependence of star formation rate (SFR) on separation
between galaxies, in great part to explore the influence of
galaxy interactions on enhancing a galaxy pair's SFR (e.g.
Li et al. 2008; Robaina et al. 2009).
The goal of this Letter is to explore the application
of weighted correlation functions to study the variation of 
observables (e.g., SFR, color, AGN accretion rate, morphol- 
ogy) as a function of radius. We briefly introduce weighted 
2 point correlation functions in §2. We then construct a 
toy model with which we study the behavior of the inferred 
weighted quantities relative to the input behavior (§3). This 
toy model is primarily to illustrate some general features 
of how weighted correlation functions recover input behav- 
ior, and we stress that the framework discussed in this Let- 
ter applies generally to any application of weighted cor- 
relation function analysis, while noting that we choose to 
present a case that is most directly analogous to the study
of SFR enhancement in close pairs of galaxies. We show
the results of this analysis in §4. In §5, we briefly compare
with observational results of SF enhancement derived using
the Sloan Digital Sky Survey (Li et al. 2008). In §6, we
present our conclusions. When necessary, we have assumed
$H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{m0} = 0.3$, $\Omega_{\Lambda 0} = 0.7$.



2 BACKGROUND 

In this work, we explore the possible artifacts that the use of
a marked correlation function could introduce when studying
the clustering of galaxy properties. A full explanation
of the methodology followed in this work has already been
presented in Robaina et al. (2009), and is similar to the
methodology adopted by Skibba et al. (2006) and Li et al.
(2008); we summarize here the basics of the method but
refer the reader to those papers for a deeper explanation.

The 2-point correlation function $\xi(r)$ is the excess probability
of finding a galaxy at a given distance $r$ from another
galaxy:

$$dP = n\,[1 + \xi(r)]\,dV, \tag{1}$$

where $dP$ is the probability of finding a galaxy in volume element
$dV$ at a distance $r$ from a galaxy, and $n$ is the galaxy
number density. A simple estimator of the unweighted correlation
function is $\xi(r) \simeq DD/RR - 1$, where $DD$ is the
histogram of separations between galaxies and $RR$ is the
histogram of separations between galaxies in a randomly
distributed catalog. In a similar way, one can estimate the
weighted correlation function as $W(r) \simeq PP/PP_r - 1$,
where $PP$ is the weighted histogram of separations between real
galaxies and $PP_r$ the weighted histogram of separations from the
catalog with randomized coordinates.

We choose to use an additive weighting scheme (the
weight of the pair is the sum of the weights of the individual
galaxies) for concreteness (e.g., Robaina et al. 2009), while
noting that a multiplicative weighting would yield a qualitatively
similar result. Then, we can define the 'mark' $E(r)$
as the excess clustering of the weighted correlation function
compared to the unweighted correlation function:

$$E(r) = \frac{1 + W(r)}{1 + \xi(r)}. \tag{2}$$
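The estimators above can be sketched with brute-force pair counts. This is only an illustrative sketch, not the pipeline used in this Letter: the function names, the uniform random catalogue and the choice to reuse a shuffled copy of the data weights for the randoms are all assumptions made for the example, and the data and random catalogues are taken to have the same size so that no extra normalisation is needed.

```python
import numpy as np

def pair_histograms(pos, weights, bins):
    """Return (unweighted, additively weighted) histograms of pair separations."""
    i, j = np.triu_indices(len(pos), k=1)          # every unique pair once
    sep = np.linalg.norm(pos[i] - pos[j], axis=1)  # 3D separations
    pair_w = weights[i] + weights[j]               # additive pair weight
    counts, _ = np.histogram(sep, bins=bins)
    wcounts, _ = np.histogram(sep, bins=bins, weights=pair_w)
    return counts.astype(float), wcounts

def mark_statistic(pos, weights, rand_pos, bins, rng):
    """Estimate xi(r), W(r) and the mark E(r) = (1 + W) / (1 + xi)."""
    dd, pp = pair_histograms(pos, weights, bins)
    # Assumption: the randoms carry the same weights, shuffled so that
    # they are uncorrelated with position.
    rr, ppr = pair_histograms(rand_pos, rng.permutation(weights), bins)
    xi = dd / rr - 1.0
    w = pp / ppr - 1.0
    return xi, w, (1.0 + w) / (1.0 + xi)

# Sanity check: with identical weights the mark must be 1 at every separation.
rng = np.random.default_rng(0)
pos = rng.random((200, 3))        # hypothetical "data" positions in a unit box
rand_pos = rng.random((200, 3))   # random catalogue of the same size
bins = np.linspace(0.0, 1.0, 6)
xi, w, e_r = mark_statistic(pos, np.ones(200), rand_pos, bins, rng)
```

With this definition $E(r)$ is simply the mean pair weight at separation $r$ divided by the mean pair weight in the randomized catalogue, which is why uniform weights return $E(r) = 1$.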



3 AN IDEALISED EXPERIMENT

We use the De Lucia et al. (2006) catalog at $z = 0$ derived from
the Millennium simulation (Springel et al. 2005) in order to
study how the enhancement in a physical quantity caused by
a galaxy-galaxy interaction (e.g., a SF enhancement) would
be recovered by weighted 2-point correlation function techniques.
We manually assign a weight (we refer to it as the
mark) to every galaxy in the sample, giving a mark $= 1$ to
galaxies which are not closer than $r_c = 35$ kpc to any other
galaxy and a mark $= e$ (with $e > 1$) to those galaxies which
are in close, 3D pairs with separation $r < r_c$. For concreteness,
we consider simulated galaxies with stellar masses
$M_* > 2.5 \times 10^{10}\,M_\odot$, noting that the conclusions reached in
this Letter are generally applicable, in a qualitative sense.

We now examine how the marks of galaxy pairs relate
to the actual behavior of the enhancement as a function of
separation from their nearest neighbor. The mark is estimated
by dividing the weighted correlation function by its
unweighted counterpart. Recall that the correlation function
relates every galaxy to every other galaxy in the sample.$^1$
The weight is additive, and since every galaxy with a companion
closer than $r_c$ has weight $e$, the mark of a close pair
is $2e$. Yet, the galaxies in this close pair will also be matched
to every other galaxy in the sample. Therefore, when
a galaxy in the same group or cluster at a distance $r > r_c$
from the enhanced pair is matched with a galaxy in the
pair, the mark of that pair will be $e + 1$ (1 being the default
weight of non-enhanced galaxies). We see that a pair with
$r > r_c$ will show an enhancement when, in reality, there is
no physical interaction-induced enhancement at that radius.
As that third galaxy will be matched with both galaxies in
the neighboring close pair, two pairs with mark $= e + 1$ will be
contributed. Furthermore, imagine now that there is another
real close pair of galaxies placed at several Mpc from the first
close pair, in which both galaxies will also have mark $= e$.
From matching all those 4 galaxies, the final product will be
6 galaxy pairs displaying mark $= 2e$. This will clearly affect
both the normalization of the mark and the recovered value
for the enhancement, producing a tail of false enhancement
in the regions where more companions would be found
(representing dense regions of the Universe) and decreasing the
enhancement found at $r < r_c$.

$^1$ Even in the case in which some criteria for pair-matching are
imposed, like line-of-sight constraints, mass ratio, etc., one
particular galaxy will be matched with many secondaries at very
different separations.



4 RESULTS

We show this effect in Fig. 1. Clearly, a relatively weak tail
of enhancement is recovered out to large separations. The
amplitude of this tail has a radial dependence, as close pairs
of galaxies tend to be found in dense regions of the Universe
(Barton et al. 2007). As the magnitude of this tail depends
on the distribution of neighbors as a function of the separation,
it will be more relevant for galaxy samples in which
the clustering is stronger (e.g., massive galaxies, or non
star-forming galaxies).

Figure 1. Apparent enhancement as a function of distance and
"real" enhancement $e$ (which acts only at $r < r_c$) computed
for a sample with a minimum stellar mass of $2.5 \times 10^{10}\,M_\odot$ and
different values of $e$. The vertical dotted line shows the separation
$r_c$ (35 kpc in this case). A tail of artificial enhancement extending
to large separations is produced as an artifact of the weighted
correlation function method. The enhancement recovered in the
close pairs ($r < r_c$) is reduced with respect to $e$ and the level of
reduction is a strong function of $e$ (see text and Fig. 2 for more
details).

Also visible in Fig. 1 is the dilution of the recovered
enhancement compared with the actual enhancement $e$ for
pairs with $r < r_c$, $E(r < r_c)$. The value of $E(r < r_c)$ is
lower than the "real" enhancement $e$ by a factor which
increases with $e$. This effect is better seen in Fig. 2, where we
show the relative discrepancy between $E(r < r_c)$ and $e$, as
a function of $e$. In this idealised case, this discrepancy can
be exactly recovered by accounting carefully for the different
pairs formed by galaxies in the sample. The relationship
between $E(r < r_c)$ and $e$ is:



$$E(r < r_c) = \frac{e\,N_{p,\mathrm{tot}}}{W_{cp,\mathrm{max}} N_{cp,\mathrm{max}} + W_{mp} N_{mp} + W_{fp} N_{fp}}, \tag{3}$$

where $N_{p,\mathrm{tot}}$ is the total number of pairs which can be
formed from the galaxy sample$^2$, $N_{cp,\mathrm{max}}$ is the total number
of pairs which can be formed with galaxies belonging
to close pairs$^3$, $W_{cp,\mathrm{max}}$ is the weight associated with those
pairs, $N_{mp}$ is the number of pairs in which only one galaxy
belongs to a close pair, $W_{mp}$ is the weight associated with
them, and $W_{fp}$ and $N_{fp}$ are respectively the weight and
the number of pairs in which none of the galaxies belongs
to a close pair.
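The three classes of pairs entering this normalisation can be counted explicitly on a small toy configuration. The sketch below is an illustration under stated assumptions rather than the calculation performed on the mock catalogue: the ten positions are invented, the helper names are hypothetical, and the mean additive weight over all unique pairs is used as a stand-in for the randomized-catalogue normalisation.

```python
from itertools import combinations
import math

def in_close_pair(positions, r_c):
    """True for every galaxy that has at least one neighbour closer than r_c."""
    return [any(math.dist(p, q) < r_c for j, q in enumerate(positions) if j != i)
            for i, p in enumerate(positions)]

def pair_census(positions, r_c, e):
    """Count pairs by class and return the recovered close-pair mark."""
    enhanced = in_close_pair(positions, r_c)
    marks = [e if flag else 1.0 for flag in enhanced]
    counts = {"both": 0, "mixed": 0, "neither": 0}
    weights, close_weights = [], []
    for i, j in combinations(range(len(positions)), 2):
        counts[("neither", "mixed", "both")[enhanced[i] + enhanced[j]]] += 1
        w = marks[i] + marks[j]                  # additive pair weight
        weights.append(w)
        if math.dist(positions[i], positions[j]) < r_c:
            close_weights.append(w)              # genuine close pairs only
    mean_all = sum(weights) / len(weights)       # stand-in for the random normalisation
    mean_close = sum(close_weights) / len(close_weights)
    return counts, mean_close / mean_all

# Hypothetical layout (kpc): two close pairs, one bystander near the first
# pair, and five isolated galaxies.
positions = [(0, 0, 0), (20, 0, 0), (150, 0, 0), (5000, 0, 0), (5020, 0, 0)]
positions += [(10000 * k, 0, 0) for k in range(1, 6)]
counts, recovered = pair_census(positions, r_c=35.0, e=2.0)
```

For this layout the four enhanced galaxies form 6 pairs with weight $2e$ although only 2 of them are real close pairs, and the recovered close-pair mark falls below the input $e = 2$ because the enhanced galaxies also raise the mean pair weight used for normalisation.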

In our particular case of an additive weight, this expression
reduces to:

$$E(r < r_c) = \frac{2e}{(f^2 + f)(e - 1) + 2}, \tag{4}$$

where $f$ is the fraction of galaxies in close pairs. The degree
of clustering of the sample is reflected in the value of $f$, so
this expression is valid under different clustering conditions.
$^2$ When performing an autocorrelation, the total number of
unique pairs would be $N(N - 1)/2$, $N$ being the number of galaxies
in the sample.

$^3$ This is not the same as the number of close pairs, as we already
explained.

Figure 2. Relative error in $E(r < r_c)$, the enhancement recovered
by the marked correlation function in close pairs, as a function
of the "real" enhancement $e$ in those pairs. For this example we
have used, as in Fig. 1, a lower mass cut of $2.5 \times 10^{10}\,M_\odot$.
Diamonds: Recovered values from the method. Solid line: Expected
value using the proper normalization shown in Eq. 4. The error
when the intrinsic enhancement is small is modest; when $e < 4$
then the discrepancy between $E(r < r_c)$ and $e$ is < 10%.

For the purposes of this work, we calculate $f$ directly from
the mock catalogue, but real galaxy surveys lack accurate
3D information. It is common to calculate $f$ from the
inferred real space correlation function by integrating Eq.
1 out to $r_c$ (Masjedi et al. 2006; Bell et al. 2006). In their
analysis, in the limit of small $r_c$, and if the correlation function
is parametrised as a power law $\xi(r) = (r/r_0)^{-\gamma}$, then:



$$P(r < r_c) = f = \int_0^{r_c} n\,[1 + \xi(r)]\,dV \tag{5}$$

$$f \simeq \frac{4\pi n}{3 - \gamma}\, r_0^{\gamma}\, r_c^{3-\gamma}. \tag{6}$$
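The small-$r_c$ limit can be checked numerically by integrating $n[1 + \xi(r)]\,4\pi r^2$ from 0 to $r_c$ and comparing with the closed power-law form $4\pi n\, r_0^{\gamma} r_c^{3-\gamma}/(3-\gamma)$. The sketch below does this with a simple midpoint rule. The number density, correlation length and slope are illustrative assumptions only, not values measured from the mock catalogue.

```python
import numpy as np

def f_numeric(n, r0, gamma, r_c, steps=200_000):
    """Midpoint-rule integral of n [1 + xi(r)] 4 pi r^2 dr from 0 to r_c."""
    edges = np.linspace(0.0, r_c, steps + 1)
    r = 0.5 * (edges[:-1] + edges[1:])      # bin midpoints, so r > 0 everywhere
    dr = r_c / steps
    xi = (r / r0) ** (-gamma)
    return float(np.sum(n * (1.0 + xi) * 4.0 * np.pi * r**2) * dr)

def f_power_law(n, r0, gamma, r_c):
    """Small-r_c limit: only the xi term of the integrand is kept."""
    return 4.0 * np.pi * n * r0**gamma * r_c ** (3.0 - gamma) / (3.0 - gamma)

# Illustrative (assumed) values: n in Mpc^-3, lengths in Mpc.
n, r0, gamma, r_c = 0.005, 5.0, 1.8, 0.035
f_full = f_numeric(n, r0, gamma, r_c)
f_limit = f_power_law(n, r0, gamma, r_c)
ratio = f_full / f_limit
```

The two values differ only through the uniform term $4\pi n r_c^3/3$, which is negligible at these separations because $\xi \gg 1$ for $r \ll r_0$.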



It is worth noting that in the above example we have 
studied the simple case in which the enhancement is present 
only in close galaxy pairs, with the enhancement represented 
by a step function. When applying weighted correlation 
functions to more complex problems, like those involving 
clustering of the mass or colour, the function describing the 
behavior of the weight on separation would be much more 
complex. In that case, an expression for the behavior of the 
weight as a function of separation will have to be derived on 
a case-by-case basis and matched with the data. Yet, even 
in that more complex case, the underlying problem is very 
similar: the magnitude of any radial dependence in proper- 
ties will be diluted and smeared out in radius by the use of 
weighted 2 point correlation function methods. 



5 AN EXAMPLE APPLICATION TO 
OBSERVATIONS 

In order to test the relevance of this analysis to the real Universe,
we compare our predictions with a well-established
phenomenon: the enhancement of the star formation rate
(SFR) in galaxy interactions. This observable has two obvious
advantages. Firstly, there are a number of works in which
this enhancement has been studied (Barton et al. 2000;
Lambas et al. 2003; Li et al. 2008; Robaina et al. 2009).
Secondly, the SFR is expected to be enhanced only at scales at
which galaxy-galaxy interactions are relevant; beyond that
scale star formation is not only not expected to be enhanced,
but should be depressed because of the well known
SFR-density anticorrelation (e.g., Balogh et al. 2002). From the
above mentioned works we choose to compare with Li et al.
(2008) for three reasons: a) they use marked statistics, b)
their large sample allowed an accurate estimate of enhancement
to be made, and c) SDSS clustering has been shown to
be similar to the one present in the De Lucia et al. (2006)
mock catalogue from the Millennium Simulation in the local
Universe (Springel et al. 2005).

Figure 3. Specific SFR enhancement in (massive) galaxy pairs
as a function of the projected separation as measured by Li et al.
(2008) (diamonds) and our prediction including the tail of artificial
enhancement out to several hundred kiloparsecs (solid line).
In this example a value of $e = 2$ has been used for galaxies in
pairs with $r_p < 15$ kpc and $e = 1.5$ for galaxies in pairs with
$15 < r_p < 40$ kpc. The galaxy sample has been selected to be
consistent with the massive sample in Li et al. (2008).

Real galaxy surveys, even spectroscopic surveys, have
no access to the real space separation of galaxies. Li et al.
(2008) used a projected correlation function $w(r_p)$ to circumvent
this difficulty, where the projected correlation function
is related to the 3D correlation function via:



$$w(r_p) = \int_{-\infty}^{\infty} \xi\left[(r_p^2 + \pi^2)^{1/2}\right] d\pi, \tag{7}$$



where $\pi$ is the coordinate along the line of sight, and $r_p$ is the
projected separation transverse to the line of sight. We use
for this exercise galaxies more massive than $3 \times 10^{10}\,M_\odot$ in
order to match the selection criteria in Li et al. (2008). Moreover,
they did not use an additive weight but used the specific SFR (SSFR)
of the primary galaxy as the weight of the pair. We also use
such a scheme here to perform our weighted analysis in the
simulation. Li et al. (2008) calculated the cross-correlation
between a subsample of galaxies which are forming stars
(primaries) and all the galaxies in the sample (secondaries).
As we lack such information we run a correlation using
all the galaxies as both primaries and secondaries. As previously,
we assign an average enhancement to all galaxies
found physically in very close pairs, but in order to mimic
the pair selection in Li et al. (2008), who cross-correlate
a sample of spectroscopically defined star forming galaxies
with a photometric catalog of all galaxies above the stellar
mass limit, we run the correlation function selecting galaxy
pairs with "line-of-sight" separations of less than 100 Mpc.



Our results are not sensitive to this choice of maximum
separation; correlations between galaxies on scales larger
than 100 Mpc are extremely weak, in comparison to the
strong clustering on < 1 Mpc scales. Motivated by the star
formation enhancement observed in galaxy samples selected
in a similar manner at different redshifts (Li et al. 2008;
Robaina et al. 2009), we choose to model the data with
$e = 2$ for galaxies in pairs with separations $r_p < 15$ kpc
and $e = 1.5$ for those in pairs with $15 < r_p < 40$ kpc.
We also neglect any environmental suppression of star
formation at separations $r > r_c$
(Barton et al. 2000; Balogh et al. 2002). These are clearly
oversimplifications, as the real dependence of enhancement
(and suppression at large radii) on separation will be
considerably more complex. Yet, this simple model suffices to
illustrate the recovered enhancement signature expected from
a model in which SF is enhanced only at small radii.
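The weak dependence of a projected correlation function on the maximum line-of-sight separation can be illustrated with a pure power law $\xi(r) = (r/r_0)^{-\gamma}$, for which the integral over the whole line of sight has a closed form in terms of Gamma functions. The sketch below is an assumption-laden illustration and not the measurement made on the simulation: $r_0$, $\gamma$ and the function names are chosen for the example, and a single power law out to 200 Mpc overestimates the real large-scale clustering, so the truncation effect shown here is, if anything, an upper bound.

```python
import math
import numpy as np

def xi_power_law(r, r0=5.0, gamma=1.8):
    """Assumed power-law correlation function; lengths in Mpc."""
    return (r / r0) ** (-gamma)

def w_projected(r_p, pi_max, r0=5.0, gamma=1.8, n=4000):
    """2 * integral from 0 to pi_max of xi(sqrt(r_p^2 + pi^2)) d(pi), trapezoid rule."""
    los = np.concatenate(([0.0], np.geomspace(1e-4, pi_max, n)))
    y = xi_power_law(np.sqrt(r_p**2 + los**2), r0, gamma)
    return 2.0 * float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(los)))

def w_analytic(r_p, r0=5.0, gamma=1.8):
    """Closed form for a power law integrated over the full line of sight."""
    g = math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)
    return r_p * (r0 / r_p) ** gamma * g

r_p = 0.1  # projected separation in Mpc
ratios = {pi_max: w_projected(r_p, pi_max) / w_analytic(r_p)
          for pi_max in (50.0, 100.0, 200.0)}
```

Each entry of `ratios` is the fraction of the full projected signal captured when the line-of-sight integration is truncated at the given `pi_max`; for these assumed parameters all three stay within one per cent of unity.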

Notwithstanding these limitations, we compare the results
of our simple model with the data in Fig. 3. Strikingly,
we find that the tail of enhanced SF out to ~200 kpc seen
in the data may, in great part, be a reflection of the use of
marked correlation function statistics to explore the radial
dependence of SF enhancement in galaxies. This has direct
relevance for the interpretation of the results from Li et al.
(2008). If one argued that the enhancement at ~100 kpc
(or much of it) was real, one would need to fulfil two criteria
to produce such an effect. Firstly, assuming that the
triggering event is the first pass, one would need an enhancement
lifetime of at least 300 Myr (longer than the internal
dynamical time) for typical orbital velocities of 300 km/s or
less. Secondly, a significant fraction of the secondaries would
need to have near-radial orbits in order to produce such an
enhancement. If, as we suggest instead, the enhanced SF at
~100 kpc is an artifact of the use of the 2 point correlation
function, then one would argue that enhancement happens
only for close pairs, and shorter interaction-induced SF
timescales and a greater diversity of orbits would be permitted.
While developing a model that realistically reproduces
the data is beyond the scope of this Letter, one can clearly
see that this effect needs to be accounted for in order to
robustly interpret the behavior of marked correlation functions.
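The lifetime quoted above follows from the time needed to cover ~100 kpc at 300 km/s. A minimal sketch of that arithmetic is given below; the unit-conversion constants are standard values, and the assumption of straight-line motion at constant speed is a simplification made only for this estimate.

```python
KM_PER_KPC = 3.0857e16    # kilometres in one kiloparsec
SEC_PER_MYR = 3.15576e13  # seconds in one million (Julian) years

def travel_time_myr(distance_kpc, speed_km_s):
    """Time to cover distance_kpc at a constant speed, in Myr."""
    return distance_kpc * KM_PER_KPC / speed_km_s / SEC_PER_MYR

# Separation and orbital velocity quoted in the text.
t_myr = travel_time_myr(100.0, 300.0)
```

Slower orbital velocities lengthen this time in inverse proportion, which is why 300 Myr is a lower limit for velocities of 300 km/s or less.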



6 CONCLUSIONS 

Weighted correlation functions are an increasingly important
tool for understanding how galaxy properties depend on
their separation from each other. We use a mock galaxy sample
drawn from the Millennium simulation, assigning weights
using a simple prescription to illustrate and explore how
well a weighted correlation function recovers the true radial
dependence of the input weights. We find that the use
of a weighted correlation function results in a dilution of
the magnitude of any radial dependence of properties and a
smearing out of that radial dependence in radius, compared
to the input behavior. We present a quantitative discussion
of the dilution in the magnitude of radial dependence in
properties in the special case of a constant enhancement $e$
for pairs separated by $r < r_c$. In this particular case the
matching of one member of an enhanced pair with an
unenhanced galaxy in the same group gives an artificial
enhancement ~ $0.1e$ out to large radii $> 5r_c$, and matches
of one member of an enhanced pair with a member of another
very distant enhanced pair pull down the value of the
recovered enhancement, with the discrepancy between the
input and recovered enhancement being a function of the
fraction of galaxies in close pairs and the value of the input
enhancement. This systematic error is < 10% for enhancements
$e < 4$, but precision measurements should account
for this effect. We compare these results with observations
of SFR enhancement from the SDSS (Li et al. 2008), finding
very similar behavior: a significant enhancement at
radii < 40 kpc and a weak enhancement out to more than
150 kpc, lending credibility to the notion that the weak enhancement
in SFR seen out to large radii is an artifact of the use of
weighted correlation function statistics. While we explored
a particular case in this Letter, it is easy to see that the
phenomenon is general.

Given this difference between input weights and those 
recovered by the weighted 2 point correlation function, one 
might ask if one shouldn't use a different method to explore 
radial trends in observables. We would argue that most dif- 
ferent methods boil down to weighted 2 point correlation 
functions implicitly anyway, and that one is stuck at least 
at the qualitative level with the differences between input 
and recovered weights that we have discussed above. For example,
partnering projected pairs into different 'pairs' (i.e.,
not matching every galaxy with every other galaxy) suffers
from two drawbacks: first, this is still a projected analysis, and
many projected close pairs will be separated by significant
distances along the line of sight; second, one may choose
the wrong galaxy to partner with, a particularly acute issue
for triplets or groups of galaxies. One can see that such a
method will suffer from a similar suppression of enhancement
from the inclusion of non-pairs in the pair sample; of course,
radial smearing is not possible in such a case, as there is only 
one radial bin. We conclude that those wishing to quanti- 
tatively analyze weighted correlation functions (or related 
observables) will need to account carefully for this effect us- 
ing an analysis of simulated mock catalogs. 